Process design, structuring the real-time program for the CLC, was one
of the difficult aspects of SAFEGUARD software development. Initially, there 
were no significant guidelines or criteria. In the course of the project, 
basic process-design rules were developed and significant experience was 
acquired. Some techniques that emerged are the use of short-running, asyn- 
chronous tasks; overlays to minimize storage requirements; and multiple 
storing of programs to minimize processor queuing. 

I. INTRODUCTION 

Process design involves defining the characteristics, interrelation- 
ships, and organizational structure of the tasks that comprise the 
operating system and the applications software. It was one of the 
difficult aspects of Safeguard software development. Initially, there 
were no specific criteria to be followed. Several iterations were required 
to converge on the final process design. The purpose of this paper is to 
present some of the basic guidelines that evolved in the course of the 
Safeguard project. The guidelines included are those believed to be 
most workable and most applicable to a wide range of real-time soft- 
ware systems. 

II. GENERAL PROCESS-DESIGN GUIDELINES 

Major efforts in the process design involved selecting among the
available methods of task enablement, selecting the time frames in
which tasks would execute, and defining task priorities.
(For a description of tasks and processor management, see Ref. 1.)

2.1 Task structure 

Initial investigation of possible process structures led to the use of 
both synchronous (time-enabled) tasking and asynchronous
(event-triggered) tasking. It was clear that critical processing had to be given
high priority, and it was generally of a synchronous nature. Asyn- 
chronous tasks were to be used to fill the time slots between critical 
synchronous tasks and to provide a uniform distribution of processing 
among the available processors. This general approach had to be 
modified by a few additional considerations. First, low-priority asyn- 
chronous tasks must have a short run time or they will hold a processor 
too long, denying access to high-priority tasks. Second, it is generally 
more difficult to design and test a process which utilizes asynchronous 
tasks. Further, it is not always necessary to achieve a uniform work 
distribution, e.g., during the process initialization and termination 
sequence. An almost totally synchronous design was chosen for process 
initialization and termination tasks to facilitate design and testing. 

It is inefficient to enable a synchronous task, only to find that the
task has no data to process because a peripheral device has not
completed its transfer or because other tasks have not generated it.
Ultimately, synchronous tasks were utilized when critical and periodic 
response was required and when the availability of data at the same 
frequency as task enablement could be guaranteed. 

The asynchronous, event-triggered task is enabled by the completion 
of an I/O transfer or by the successful completion of processing by a
predecessor task or tasks. Each predecessor task can conditionally 
enable one or more successor tasks. A successor task is absolutely 
enabled, i.e., ready to run, only after all conditional enablement 
criteria have been satisfied. The predecessor-successor relationship of 
conditional enablement can also help alleviate data interference prob- 
lems. Table I depicts some of the process-design questions that were 
faced and the type of tasks used to answer these questions. 



Table I - Process design

Problem description                 Task description
----------------------------------  ----------------------------------------
Support high-frequency, high-       Synchronous task whose frequency is at
accuracy endoatmospheric target     least as high as the update
track.                              requirements.

Process intersite communications    Asynchronous tasks whose trigger for
message traffic.                    enablement is the arrival of intersite
                                    communication messages.

Generate time-ordered, simulated    Both synchronous and asynchronous tasks.
radar replies during an exercise.   Tasks that generate the replies are
                                    synchronous. These tasks conditionally
                                    enable an asynchronous task which
                                    time-orders and outputs the simulated
                                    replies.

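The conditional-enablement rule of Section 2.1 can be illustrated with a minimal Python sketch. It is not SAFEGUARD code: the `Task` class, the `run_cycle` helper, and the task names are hypothetical, and the sketch assumes only that a successor becomes absolutely enabled once every one of its predecessors has conditionally enabled it.

```python
class Task:
    """Hypothetical successor task, ready only after all predecessors signal it."""

    def __init__(self, name, predecessors=()):
        self.name = name
        # Conditional enablements that are still outstanding.
        self.pending = set(predecessors)

    def conditionally_enable(self, predecessor):
        """Record one predecessor's completion; True once absolutely enabled."""
        self.pending.discard(predecessor)
        return not self.pending


def run_cycle(completion_order, successors):
    """Return successor names in the order they become absolutely enabled."""
    ready = []
    for done in completion_order:
        for task in successors:
            if done in task.pending and task.conditionally_enable(done):
                ready.append(task.name)
    return ready


# Two synchronous generator tasks feed one asynchronous ordering task,
# loosely following the third row of Table I.
order_replies = Task("order_replies", predecessors={"gen_a", "gen_b"})
print(run_cycle(["gen_a", "gen_b"], [order_replies]))  # ['order_replies']
```

The successor is not reported ready after the first generator completes, only after the second, which is the behavior the text describes.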
2.2 Parallel processing

There were several cases where identical processing had to be re- 
peated for several items in a short time frame. In this case, the through- 
put requirement exceeded that of a single processor. The solution to 
the problem was to parallel process, i.e., to define several tasks execut- 
ing identical code. Since the code was re-entrant, only one program 
copy was required even though each instance of the task could be 
separately controlled and separately enabled. Again, the structure of 
this processing could be synchronous, asynchronous, or a combination 
of both. It was found necessary to parallel process different types of 
tasks to take full advantage of the multiprocessor environment. 

Obviously, multiple-instance task use may cause processor queuing 
problems. These can be alleviated by storing one program copy for 
each task. The critical consideration determining the number of pro- 
gram copies needed is the response requirement on the tasks involved. 

2.3 Data interference 

One of the primary design goals was to maximize throughput of the 
processing system. A natural implication of this was an attempt, in the 
beginning, to multiprocess everything. This immediately triggered 
task-to-task data-interference problems. Reviewing the task-response 
requirements made it obvious that not only was it not necessary to 
multiprocess all tasks, but in many instances it was impossible. 

This observation led designers to take a closer look at task time- 
frame design and the serial-processing relationship among tasks. From 
these investigations evolved two basic task-design guidelines for 
avoiding data interference. If possible, competing tasks should be 
assigned to nonoverlapping time frames of possible execution.* If this 
could not be done, an attempt was made to establish predecessor- 
successor relationships among them. These techniques could be used 
only infrequently when tasks were competing for data. 

Since a large number of data-interference problems were not solvable 
by either of these techniques, attention was directed to data-base 
design. Many interference problems arose when only two tasks were 
in competition, one loading the data and the other processing them. In 
those instances where the competing tasks were accessing a variable 
number of data items each time executed and the response requirements 
on the task were not critical, a circular queue with an access mechanism 
called a take-load pointer was used. With this mechanism, the loading 
task uses the load pointer to control the writing of data. It never
writes beyond the take pointer. The processing task uses the take
pointer to control the reading of the data. It never takes beyond the
load pointer. This technique alleviated about 10 percent of the
interference problems.

* A time frame is a time "window" in which a task is allowed to execute.
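A minimal Python sketch of a circular queue with separate take and load pointers follows. The class and method names are hypothetical, and leaving one slot permanently empty to distinguish a full queue from an empty one is an assumption of this sketch, not a detail given in the text. It assumes exactly one loading task and one processing task, and it uses `None` to signal "nothing to take", so `None` cannot itself be queued.

```python
class TakeLoadQueue:
    """Single-loader, single-taker circular queue (illustrative only)."""

    def __init__(self, capacity):
        # One extra slot stays empty so that full and empty are distinguishable.
        self.slots = [None] * (capacity + 1)
        self.load = 0  # next slot the loading task will write
        self.take = 0  # next slot the processing task will read

    def put(self, item):
        """Loading task: write one item, never beyond the take pointer."""
        nxt = (self.load + 1) % len(self.slots)
        if nxt == self.take:
            return False  # queue full
        self.slots[self.load] = item
        self.load = nxt  # only the loader ever moves the load pointer
        return True

    def get(self):
        """Processing task: read one item, never beyond the load pointer."""
        if self.take == self.load:
            return None  # queue empty
        item = self.slots[self.take]
        self.take = (self.take + 1) % len(self.slots)  # only the taker moves it
        return item


q = TakeLoadQueue(2)
print(q.put(1), q.put(2), q.put(3))  # True True False
print(q.get(), q.get(), q.get())     # 1 2 None
```

Because each pointer is written by only one task, the two tasks can handle a variable number of items per execution without a lock, provided pointer updates are atomic on the target machine.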

When two high-frequency tasks with critical response-time require- 
ments were competing for data, a double-buffering technique was useful 
to avoid data interference. In this case, two tasks both execute at a 
high frequency and in the same time frame. One loads the data and the 
other processes it. The competition question was solved by dividing 
the data area into two identical buffers, one of which was being loaded 
while the other was being unloaded. When unloading was complete, 
the buffers were switched. This technique worked but was of limited
applicability.
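The double-buffering arrangement can be sketched in Python as follows. The `DoubleBuffer` class and its method names are hypothetical; the sketch assumes that the switch is performed only after unloading is complete, as the text requires, and it does not model how the two tasks were synchronized in the real system.

```python
class DoubleBuffer:
    """Two identical buffers: one being loaded, the other being unloaded."""

    def __init__(self):
        self.buffers = ([], [])
        self.loading = 0  # index of the buffer currently being loaded

    def load(self, item):
        """Loading task: append to the buffer currently open for loading."""
        self.buffers[self.loading].append(item)

    def unload(self):
        """Processing task: read the buffer that is not being loaded."""
        return list(self.buffers[self.loading ^ 1])

    def switch(self):
        """Swap roles; call only when unloading of the other buffer is complete."""
        self.loading ^= 1
        self.buffers[self.loading].clear()  # discard data already processed


db = DoubleBuffer()
db.load(1)
db.load(2)
db.switch()
db.load(3)          # loader fills the second buffer
print(db.unload())  # [1, 2]
db.switch()
print(db.unload())  # [3]
```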

As a final resort to solving interference problems, locking and un- 
locking conventions were used. These conventions required use of 
predefined program-logic sequences to lock and unlock data areas. 
These sequences relied on a special CLC instruction called a "biased
fetch" which was implemented for this purpose. (For a more complete 
description, see Ref. 2.) Locking will always work, provided locking 
conventions are observed and enforced. Improper use of locking has 
caused the integration effort many headaches. The improper use of 
locks will manifest itself in a thousand disguises. However, it was 
necessary to use locking to solve more than half of the interference 
cases. 
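The locking convention can be illustrated with a short Python sketch. The operation of the CLC "biased fetch" instruction is not described here, so the sketch substitutes Python's standard `threading.Lock`; the `LockedArea` class and `worker` function are hypothetical. The point shown is the convention itself: every access to the shared data area goes through one predefined lock-and-unlock sequence.

```python
import threading


class LockedArea:
    """Shared data area guarded by a single predefined lock/unlock sequence."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for a biased-fetch-based lock
        self.value = 0

    def update(self, delta):
        # The 'with' statement pairs every lock with an unlock,
        # even if the body raises an exception.
        with self._lock:
            self.value += delta


def worker(area, count):
    for _ in range(count):
        area.update(1)


area = LockedArea()
threads = [threading.Thread(target=worker, args=(area, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(area.value)  # 4000
```

A task that modified `value` directly, bypassing `update`, would be an instance of the improper lock use that the text warns about.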

2.4 Discussion 

How well is the process working? How close does the process conform 
to the process-design requirements? These are two questions that were 
constantly asked. To answer them, a process performance-monitoring 
capability was implemented. The implementation relied on constant 
monitoring of "probe" or test points within the process. Implantation 
of these probes into the process and interpretation of the resulting 
data proved useful for fine tuning the design and verifying that the 
basic requirements were being met. This should have been done much 
earlier in the design cycle. Probes should be capable of furnishing such
data as routine and subroutine execution timing; the time differential
between when a task is enabled and when it actually acquires a
processor; and minimum, maximum, and average task run times.
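The kind of data a probe should furnish can be sketched in Python. The `TaskProbe` class, its field names, and the integer time stamps are hypothetical; the sketch assumes that each task execution yields three time stamps (enabled, processor acquired, finished) in a common unit.

```python
class TaskProbe:
    """Collects per-task timing samples (illustrative only)."""

    def __init__(self):
        self.run_times = []        # finished - started, per execution
        self.dispatch_delays = []  # started - enabled, per execution

    def record(self, enabled_at, started_at, finished_at):
        """Store one execution's dispatch delay and run time."""
        self.dispatch_delays.append(started_at - enabled_at)
        self.run_times.append(finished_at - started_at)

    def summary(self):
        """Minimum, maximum, and average run time, plus worst dispatch delay."""
        return {
            "min_run": min(self.run_times),
            "max_run": max(self.run_times),
            "avg_run": sum(self.run_times) / len(self.run_times),
            "max_delay": max(self.dispatch_delays),
        }


probe = TaskProbe()
probe.record(enabled_at=0, started_at=2, finished_at=10)
probe.record(enabled_at=100, started_at=101, finished_at=105)
print(probe.summary())
# {'min_run': 4, 'max_run': 8, 'avg_run': 6.0, 'max_delay': 2}
```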

This section would be incomplete without a few words about the 
position of the process designer. It became obvious that the process 
designer must participate in program design and integration. He must 
do this to guarantee that the program designers do not stray from the 
process-design requirements on program timing and interfaces. He 
must be part of the integration effort to ensure that the process design
is actually implemented in the process. Furthermore, it was found that 
the process designer required this program design experience and 
integration experience to be able to accurately interpret performance 
data and to use it to refine the design of the process. 

III. SYSTEM SIZING CRITERIA 

Estimates of the number of processors, program stores, and variable 
stores needed to do the job were continually monitored in the light 
of the mission to be fulfilled by the system. System sizings were an 
iterative effort. As requirements solidified and understanding of them 
improved, as routine, subroutine, and data-base estimates improved, 
and as simulation tools for forecasting system loading improved, sizing 
estimates changed. 

3.1 System operating points as design input 

It was the process designers' responsibility to map system perform- 
ance requirements into the number of instructions needed to code these 
requirements, the amount of variable store required to support the 
data base, and the number of processors needed to meet throughput 
requirements. The design effort attempted to balance, on a system cost 
basis, the inevitable trade-offs among these three resources. 

To facilitate evaluation of the impact of the various trade-offs on 
process design, a contour or envelope of possible system operating 
points was developed. Points on this contour reflected maximum usage 
of one or more resources and/or maximum processing capability of one 
or more process functions. It soon became clear that there were not 
enough resources to support the "worst-case" condition for all process 
functions. Further, it was not only impossible to support the worst 
case, but not necessary, since all functions do not peak simultaneously. 
Once the contour was identified and a feasible and reasonable set of 
operating points selected from it, trade-offs could be thoroughly 
examined. 

After the operating point was selected, it was the responsibility of 
the process designers to ensure that the design supported it. It was this 
effort that required the continual resizing of the system to guarantee 
that it would fit into the resources available. 

3.2 Minimizing core requirements by the use of overlays 

As design proceeded, program storage resources were rapidly ex- 
hausted. Further investigation showed that there were certain sets of 
programs that were not required to be in core simultaneously since their 
functions were mutually exclusive. Another set of programs had such 
"loose" timing requirements that they could be called in from a
peripheral storage device prior to execution. Examples of such sets are 
hardware test programs, display update programs, and system initiali- 
zation programs. 

3.3 Load balancing 

One of the most critical factors that influenced selection of the system 
operating point was the need to maintain a balance between the 
capability of the application process and the exercise process; that is,
the exercise process must be capable of driving the application process 
at or above the system operating point (see Ref. 3).

When planning for load balancing, two factors must be studied. 
These factors are the "immediate-response" processing requirements, 
representing a maximum allocation of resources applied for a short 
time, and the "long-term" or residual processing requirements, repre- 
senting the load over a typical processing cycle. 

Since the process had two basic time frames, one approximately 5 
to 10 ms and one approximately 50 to 100 ms, two levels of load balanc- 
ing were needed, short term and long term. Experience showed the 
most critical need for load balancing to be at the short-term level. It 
was also the most difficult to satisfy. Once the short-term problem was 
solved, the long-term problem disappeared. Short-term balancing was 
found to be extremely sensitive to changes in routine and subroutine 
execution times, and tuning the balance was always required. 

IV. ALLOCATION OF RESOURCES 

Consideration of possible process structures led to three basic alter- 
natives for the allocation of the most critical system resources, processor 
and radar time. The first alternative is fixed allocation in which the 
execution time frame of each task is fixed in nonreal time by the 
process designer. The second alternative is real-time allocation in which 
the execution time frame of each task is determined dynamically by a 
synchronous allocation task included in the process. The third alter- 
native is a combination of the previous two. 

Initially, fixed allocation with its heavy reliance on synchronous 
tasking was favored because it appeared to be easier to design and test, 
and its reactions to traffic were easier to predict. After study, this 
design was rejected because it resulted in a nonuniform distribution 
of the work which, it was thought, would result in unacceptable system
performance.

The second alternative to a process structure centered on attempting 
to allocate almost all resources in real time. This technique yields a 
much more uniform distribution of work among the processors and a 
better utilization of resources; however, designing and testing this type
of process appeared to be very complex. In addition, it was decided 
that the uniformity of the distribution of work was not as critical as 
first thought. 

Process design eventually included both types of allocation. This 
combination allowed the process to be designed and tested in a 
timely manner and yielded a nearly uniform distribution of work, 
giving reasonable processor utilization. 

V. OVERLOAD RESPONSE REQUIREMENTS 

Safeguard process designers had to answer the question of what to 
do when there were more requests for service than could be accom- 
modated. Because it was felt that the inherent overload handling of 
the priority tasking structure was not sufficient, a predefined, fixed- 
response technique was developed. 

In this approach, a tunable processing load point was defined at 
which overload-response rules were invoked. The exact rule to be used 
depended on the outcome of an overload function which "predicted" 
processor usage for the next cycle. This prediction was done by sum- 
ming selected system-traffic components weighted by an appropriate 
factor. Depending upon predicted processor usage, the execution of 
certain lower-priority tasks was curtailed. The higher the predicted 
usage, the more tasks were curtailed. Once the system entered over- 
load, it remained there for the duration of the engagement. 
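The fixed-response overload rule can be sketched in Python. Everything numeric here is hypothetical: the traffic components, their weights, the load point, and the `step` that maps predicted usage to a number of curtailed tasks are illustrative assumptions, as is the choice to keep at least one task curtailed once overload has been entered. The sketch shows only the structure described above: a weighted sum predicts usage, higher predictions curtail more low-priority tasks, and the overload state latches.

```python
class OverloadMonitor:
    """Predefined, fixed-response overload rule (illustrative only)."""

    def __init__(self, weights, load_point, curtail_order, step):
        self.weights = weights              # weighting factor per traffic component
        self.load_point = load_point        # tunable point at which rules are invoked
        self.curtail_order = curtail_order  # lowest-priority tasks first
        self.step = step                    # extra predicted usage per further task
        self.in_overload = False            # latches for the rest of the engagement

    def predict(self, traffic):
        """Weighted sum of the selected system-traffic components."""
        return sum(w * traffic.get(name, 0) for name, w in self.weights.items())

    def tasks_to_curtail(self, traffic):
        """Return the lower-priority tasks to curtail in the next cycle."""
        usage = self.predict(traffic)
        if usage >= self.load_point:
            self.in_overload = True
        if not self.in_overload:
            return []
        excess = max(0, usage - self.load_point)
        count = min(len(self.curtail_order), 1 + excess // self.step)
        return self.curtail_order[:count]


monitor = OverloadMonitor(
    weights={"tracks": 2, "messages": 1},
    load_point=100,
    curtail_order=["display_update", "data_recording", "hardware_test"],
    step=20,
)
print(monitor.tasks_to_curtail({"tracks": 30, "messages": 10}))  # []
print(monitor.tasks_to_curtail({"tracks": 50, "messages": 25}))
# ['display_update', 'data_recording']
print(monitor.tasks_to_curtail({"tracks": 30, "messages": 10}))  # ['display_update']
```

The third call shows the latch: predicted usage has fallen below the load point, but the system remains in overload.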

This technique eliminated the additional testing and design required 
to implement a feedback type of overload response. The feedback 
technique was tried in the prototype system and was found to be 
impractical. 

VI. MULTIPROCESSOR QUEUING PROBLEMS 

Minimizing task run times was of critical importance for certain 
process functions, e.g., endoatmospheric tracking. Generally, functions
with critical response times were also those functions selected for 
multiprocessing. This quickly led to a realization of the impact on task 
run time of processors queuing for instructions. 

A decision had to be made either to use multiple copies of multiple- 
instance parallel tasks or to divide the program into subunits. The final 
decision was based on each task's response requirement. For example, 
in one instance five identical tasks executing from a single program 
copy ran 77 percent longer than single-processor run time. The same 
programs were suitably subdivided and partially distributed to five 
independently addressable storage units and run time was reduced to 
a level about 25 percent greater than single-processor run time. Of 
course, if five complete copies were stored in five different independently
addressable storage units, there would be no increase in the parallel- 
tasking time versus single-processor execution. The final decision made 
was to use multiple program copies only for those tasks that always 
had to execute at maximum efficiency. This was done to conserve 
program storage. More commonly, large programs were divided into 
subunits distributed among program storage units in such a manner as 
to equalize the number of accesses per storage unit per time interval. 
This general technique was found to be sufficient for a large number of 
applications. 
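One way to equalize accesses per storage unit can be sketched in Python. This is a common greedy heuristic chosen for illustration, not the procedure actually used on the project: subunits are placed in descending order of estimated access count, each on the storage unit with the lowest running total. The function name, the subunit names, and the access counts are hypothetical.

```python
import heapq


def distribute(subunit_accesses, n_units):
    """Greedily assign program subunits to storage units to balance accesses.

    subunit_accesses maps a subunit name to its estimated number of
    accesses per time interval.
    """
    heap = [(0, unit) for unit in range(n_units)]  # (total accesses, unit index)
    placement = {unit: [] for unit in range(n_units)}
    # Place the most heavily accessed subunits first; ties break by name.
    ordered = sorted(subunit_accesses.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, accesses in ordered:
        load, unit = heapq.heappop(heap)  # least-loaded storage unit
        placement[unit].append(name)
        heapq.heappush(heap, (load + accesses, unit))
    return placement


accesses = {"a": 50, "b": 30, "c": 20, "d": 20, "e": 10}
print(distribute(accesses, 2))  # {0: ['a', 'd'], 1: ['b', 'c', 'e']}
```

In this example the two units receive 70 and 60 accesses per interval. The heuristic does not guarantee an optimal split, but it is simple to rerun whenever the access estimates change.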

VII. SUMMARY 

Initially, there were no significant guidelines to process design; these
were developed as design progressed. No claim is made that the criteria 
which evolved in our design are exhaustive, but they should be ap- 
plicable to a wide spectrum of real-time software systems. 

It was good design practice to use short-running, low-priority, 
asynchronous tasks wherever possible. This helped alleviate task 
scheduler conflict problems, which arose when there were a large 
number of high-priority synchronous tasks. It helped guarantee that 
high-frequency, high-priority tasks would execute at their specified 
frequency, and it also aided in achieving a more uniform work 
distribution. 

Data-interference problems arise naturally in a multiprocessing 
environment. The most useful technique to solve these problems was 
consistent use of software locking conventions; however, improper
implementation of these techniques caused problems during integration.

To minimize system overhead and to avoid wasting processing time, 
tasks should be enabled only when they have work to do. Synchronous 
tasking should be used only if data are available to be processed at the 
same frequency as the enablement. 

Since it was essential to maintain a balance of capabilities between 
the application process and the exercise process, it was required that 
the interfaces between these processes be established as soon as possible 
and that their integrity be rigidly maintained. 

Because it was necessary to measure how well the process was work- 
ing, it was found that performance probes should be included in the 
initial design and considerable thought should be given to their correct 
placement. Performance probes proved invaluable throughout the 
system-integration process, particularly in helping to identify task- 
timing and queuing problems. Resolution of these problems requires 
that the process designer become deeply involved in the test-and- 
integration effort. 
Finally, process design is iterative. For this reason, it is important
that the design be kept as simple and straightforward as possible. This 
standard guideline of programming is even more important in process 
design because of the inherent complexity of the multiprocessing 
environment. 